What is negative binomial distribution?

The negative binomial distribution is a discrete probability distribution that models the number of trials needed for a specified number of successes in a sequence of independent trials.

  • Definition: The negative binomial distribution describes the probability of observing r successes in x trials, with the last trial resulting in a success, given a probability of success p on each trial.

  • Parameters: It is parameterized by r (the number of successes) and p (the probability of success). r must be a positive integer, and p must be a value between 0 and 1.

  • Probability Mass Function (PMF): The PMF gives the probability of observing exactly x trials to get r successes.

  • Relationship to other distributions: The negative binomial distribution is a generalization of the geometric distribution (when r=1). As r increases, it can be approximated by the normal distribution.

  • Applications: It's used in various fields, including ecology (modeling species distribution), epidemiology (modeling disease outbreaks), and business (modeling customer arrivals).

  • Mean and Variance: The mean of the negative binomial distribution is r/p, and the variance is r(1-p)/p².

  • Overdispersion: The negative binomial distribution is often used to model count data when the variance exceeds the mean, a phenomenon called overdispersion, which is often encountered in real-world data.